Comparing the Use of Edited and Unedited Text in Parser Self-Training
Abstract
We compare edited text (newswire) and unedited text (discussion forum posts) as sources of training material in a self-training experiment with the Brown reranking parser, evaluated on a test set of sentences from an online sports discussion forum. Grammars induced from the two automatically parsed corpora achieve similar Parseval F-scores, with those induced from the discussion forum material being slightly superior. An error analysis reveals that the two types of grammars nonetheless behave differently.
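For readers unfamiliar with the metric, a Parseval F-score is the harmonic mean of labeled bracketing precision and recall between a parser's output and the gold-standard tree. The following is a minimal Python sketch of that calculation, assuming each constituent is represented as a `(label, start, end)` tuple. The function name `parseval_f1` and the toy brackets are hypothetical illustrations, not the evaluation tool used in the paper, and the sketch omits conventions that standard scorers apply, such as ignoring punctuation.

```python
from collections import Counter


def parseval_f1(gold_brackets, test_brackets):
    """Return (precision, recall, F1) over labeled brackets.

    Each bracket is a (label, start, end) tuple. Brackets are compared
    as multisets, so a repeated bracket is only matched as many times
    as it occurs in both the gold and the test analysis.
    """
    gold = Counter(gold_brackets)
    test = Counter(test_brackets)
    matched = sum((gold & test).values())  # multiset intersection

    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy sentence of four tokens: the parser misplaces one NP.
gold = [("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)]
test = [("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 3, 4)]

precision, recall, f1 = parseval_f1(gold, test)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case three of the four brackets match on each side, so precision, recall, and F1 all come out at 0.75.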
Similar resources
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian also has multi-part words. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Exploring self training for Hindi dependency parsing
In this paper we explore the effect of self-training on Hindi dependency parsing. We consider a state-of-the-art Hindi dependency parser and apply self-training using a large raw corpus. We consider two types of raw corpus, one from the same domain as the training and testing data and the other from a different domain. We also run an experiment in which we add small gold-standard data to the training s...
Investigating the Effect of Educational Text Messages on Self-Care in Hypertensive Patients in a Hypertension Clinic in Kerman
Introduction: Self-care in chronic diseases implies the study and control of disease symptoms, maintaining a healthy lifestyle, and daily functioning. The objective of this study was to investigate the effect of educational text messages on self-care in hypertensive patients in one of the hypertension clinics in Kerman in 2020. Method: The statistical population of this quasi-experimental study...
Publication date: 2011